The current meta-analysis investigated the extent to which personality traits changed as a result of 
intervention, with the primary focus on clinical interventions. We identified 207 studies that had tracked 
changes in measures of personality traits during interventions, including true experiments and prepost 
change designs. Interventions were associated with marked changes in personality trait measures over an 
average time of 24 weeks (e.g., d = .37). Additional analyses showed that the increases replicated across 
experimental and nonexperimental designs and for nonclinical interventions, and persisted in longitudinal 
follow-ups of samples beyond the course of intervention. Emotional stability was the primary trait 
domain showing changes as a result of therapy, followed by extraversion. The type of therapy employed 
was not strongly associated with the amount of change in personality traits. Patients presenting with 
anxiety disorders changed the most, and patients being treated for substance use changed the least. The 
relevance of the results for theory and social policy is discussed. 

A significant body of evidence has accumulated to show that 
personality traits predict meaningful life outcomes, such as eco- 
nomic well-being, relationship success, health, and longevity (see, 
e.g., Borghans, Duckworth, Heckman, & Ter Weel, 2008; Ozer & 
Benet-Martinez, 2006; Roberts, Kuncel, Shiner, Caspi, & Gold- 
berg, 2007 for reviews). While important findings in their own 
right, the links between personality traits and important life 
outcomes invite the question of whether personality is fate. This 
question is really a question of whether personality traits change. 
If personality traits do change, then personality traits would not 
necessarily be fate because people could develop in ways that 
might positively impact their future. 

The answer to the question of whether personality traits change 
is yes, and not just early in the life span. Indeed, multiple studies 
have now provided evidence that personality traits change in 
young adulthood (e.g., Neyer & Asendorpf, 2001; Robins, Fraley, 
Roberts, & Trzesniewski, 2001), middle age (e.g., Hill, Turiano, 
Mroczek, & Roberts, 2012), and even old age (e.g., Mõttus, 
Johnson, & Deary, 2012; Mroczek & Spiro, 2003; Small, Hertzog, 
Hultsch, & Dixon, 2003). Moreover, meta-analytic work has sug- 
gested that people become more confident, agreeable, conscien- 
tious, and emotionally stable with age (Roberts, Walton, & Viech- 
tbauer, 2006). That said, the rate of personality trait change 
demonstrated in all of these studies is rather modest. Therefore, 
normative changes in personality traits may provide little comfort 
for those starting life low on key traits, such as conscientiousness. 
By the time these individuals might catch up with their peers, they 
may be too old to take advantage of the opportunities afforded by 
growth. 

Nonetheless, the fact that personality traits do change through- 
out the life course invites a reasonable question: Can personality 
traits be changed through intervention? Moreover, if personality 
traits can be changed, how fast does the change occur and does the 
change remain once it has occurred? Within the field of personality 
psychology, there is a distinct lack of research investigating 
techniques to change personality traits; thus, the question would appear 
to have not been systematically addressed. However, in clinical 
psychology there is a long-standing, if relatively unappreciated, 
current of research that incorporates personality trait measures to 
test the effectiveness of various forms of therapeutic interventions 
on clinical outcomes, such as anxiety and depression. In almost all 
cases, the focus of these studies was not to change personality 
traits (cf. Tang et al., 2009), but simply to track the effectiveness 
of therapy while using a wide variety of measures. The effort to be 
comprehensive in assessing potential outcomes has resulted in a 
surprisingly large corpus of studies that include change in person- 
ality trait measures. 

The overarching question of whether measures of personality 
traits can and do change in therapy is not only an interesting 
question, but also one that is highly relevant to the theory and 
utility of personality traits. Theoretically and conceptually, the 
topic of personality trait development was not the focus of per- 
sonality research for many years because the field was dominated 
by extreme positions that characterized personality either as es- 
sentially fixed (an essentialist position) or as so inconsistent that it 
was inconsequential (contextualist; see Roberts & Caspi, 2001). 
Accordingly, dominant theoretical models of personality provide 
little theoretical help to explain the fact that personality traits 
exhibit both continuity and change. Recently, however, theoretical 
frameworks have been developed that treat personality traits as 
developmental constructs (Durbin & Hicks, 2014; Roberts & Jack- 
son, 2008). The central perspective within the more developmental 
take on personality traits is that they combine both continuity and 
change and typically change slowly and incrementally (Roberts, 
2006) with meaningful change taking course over the span of years 
rather than weeks. The existence of personality trait change that 
occurs within the time-line of a typical therapeutic context would 
challenge these assumptions that have emerged largely through the 
examination of long-term, passive longitudinal studies. 

The current article leverages the neglected, yet potentially im- 
portant body of research in clinical psychology. In the following 
sections we first review clinical psychology perspectives on the 
changeability of personality traits in the context of therapeutic 
interventions. Second, we present a meta-analysis of interventions 
that assessed personality trait change to quantify whether and how 
much personality traits changed as a result of therapy. We also 
examine critical issues, such as whether the change endured once 
people left the therapeutic setting. Finally, we discuss how the 
findings are relevant to theoretical and applied issues surrounding 
the assessment and use of personality traits. 


Can Personality Traits be Changed Through 
Clinical Intervention? 


There is a long-standing literature within clinical psychology 
that has directly or indirectly addressed the question of whether 
typical clinical interventions, such as cognitive-behavioral 
therapy or pharmacological treatments, can and do change personality 
traits (e.g., Bagby, Joffe, Parker, Kalemba, & Harkness, 1995; 
Borkovec et al., 1987). More recently, several authors have con- 
sidered more directly the topic of personality trait change in the 
context of therapeutic interventions (e.g., Barlow, Sauer-Zavala, 
Carl, Bullis, & Ellard, 2014; Clark et al., 2003; Quilty et al., 2008; 
Soskin, Carl, Alpert, & Fava, 2012). Two opposing positions have 
emerged concerning the changes seen in personality traits during 
therapy. The assumption underlying both positions is that any 
given personality trait measure captures both state and trait vari- 
ance to some unknown degree. The first position, the state-artifact 
position, argues that any changes seen in personality trait measures 
that appear as the result of therapy can be attributed to the state- 
level variance in personality trait measures (Du, Bakish, Ravin- 
dran, & Hrdina, 2002; Gracious, 1999; Marchevsky, 1999). For 
example, an episode of depression would cause an inflection in 
personality trait scores because it would push down the state 
variance of a relevant trait like emotional stability. Once the 
depressive state is lifted, one would expect trait scores to rise too. 
The increase in personality traits, such as emotional stability, 
would not be attributable to change in the trait component of the 
construct but the state component instead. Therefore, what looks 
like personality trait change (e.g., the decrease in emotional sta- 
bility associated with a major depressive episode and the subse- 
quent increase in the trait as the depression lifted) would only be 
temporary state changes that result from the fact that our trait 
measures are imperfect and capture both state and trait variance. 

Conversely, the cause-correction hypothesis (Soskin et al., 
2012) proposes that the changes demonstrated in psychological 
outcomes, such as depression, are the result of changes in the trait 
component and not the state component of personality. For exam- 
ple, a double-blind placebo control trial (Tang et al., 2009) showed 
that taking paroxetine (an antidepressant) resulted in both im- 
provement in depression and increases in emotional stability. Most 
importantly, the changes in emotional stability were differentiated 
from changes in depression and appeared to be the mechanism 
through which lasting changes in depression were made, as they 
mediated the effect of antidepressants on depression and long-term 
relapse. 

Other types of clinical and nonclinical intervention studies also 
demonstrate that personality traits are amenable to change across 
the life span. Training programs, in which participants learn some 
type of life skill, appear to be especially effective in changing 
personality traits. For example, a mindfulness intervention was 
associated with personality trait changes in conscientiousness, 
agreeableness, empathy, and emotional stability among medical 
residents (Krasner et al., 2009). Similarly, a social-skill training 
program for recovering substance abusers led to increases in 
agreeableness, conscientiousness, and emotional stability (Pied- 
mont, 2001; see also Oei & Jackson, 1980). Moreover, a cognitive 
training intervention for older adults was also associated with 
changes in a personality trait. Across 16 weeks, older adults 
learned inductive reasoning skills and completed 10 hr a week of 
crossword and Sudoku puzzles. Compared with a control condi- 
tion, the intervention increased participants’ levels of openness to 
experience (Jackson et al., 2012). 

The clinical perspectives on personality trait change in thera- 
peutic settings help to frame the research questions that need to be 
addressed to test whether the changes in personality traits are the 
result of changes in the state or trait structure of the construct. 
First, it is critical to both the state-artifact and cause-correction 
positions to show that personality trait measures change as a result 
of therapy. Thus, the first question is whether there is an associ- 
ation between attending therapy and personality trait change. But 
it is not enough to show that personality traits change during 
therapy as this type of effect could simply be attributable to 
changes in states. For example, a large number of studies described 
below relate the experience of a specific form of therapy to 
personality trait change with no comparison group or control 
group. These types of results support the idea that personality traits 
can and do change quickly, but not the inference that the change is 
caused by therapy and not the result of the drift in states. A more 
stringent test of whether personality trait change is the result of 
therapy can be derived from experimental studies in clinical 
psychology. Experimental studies offer some advantage over 
observational studies in this case, as the control groups in the clinical 
studies in this particular data set are typically wait-list control 
groups. These control groups may show naturalistic recovery that 
is most consistent with the state-artifact position. If experimental 
studies show no advantage of therapy over a wait-list control 
group, this would support the state-artifact position. But the ex- 
perimental effect on personality trait change, if it exists, could still 
be interpreted as state-level change, as it is possible that short-term 
therapy is truly more effective at changing state variance than 
“time,” the presumed ingredient in a wait-list condition. Therefore, 
we also examine evidence that changes that occurred in therapy 
were long-lasting. Temporary, as opposed to long-term, changes in 
trait measures would be strong support for the state-artifact posi- 
tion. If trait change persists over long periods of time, however, 
then this would be more consistent with therapy shifting enduring 
trait variance, rather than state variance. 

Given that personality trait change is complex with several 
alternative perspectives, we took a holistic approach to surveying 
the literature. No one effect size estimate or empirical trend is 
sufficient to determine the plausibility of therapeutically induced 
trait change. Instead, we integrate and interpret multiple sources of 
data drawing on complementary strengths of observational and 
experimental studies. 


The Current Study 


In early meta-analytic reviews, moderate changes in personality 
trait measures were reported as a result of psychotherapy (Jorm, 
1989; Smith, Glass, & Miller, 1980). Since these reviews, hun- 
dreds more studies have tracked personality traits and how they 
changed during therapy. These studies can be organized into 
several categories. Some studies were designed as true experi- 
ments with random assignment to either a control or an experi- 
mental group. Many other studies lacked a control group and 
simply assessed change that putatively arose through therapy in 
relevant psychological outcomes like depression, as well as in 
personality traits. Finally, another category of studies conducted 
long-term follow-ups of interventions to determine whether the 
changes lasted. 

The majority of studies used in the present review did not focus 
explicitly on changing personality traits. For example, it was 
common for researchers to default to including both the state and 
the trait versions of anxiety and anger measures (e.g., Spielberger, 
Jacobs, Russell, & Crane, 1983) because the inventories that 
measure these constructs often came in these two forms. In other 
studies, researchers chose scales that seemed therapeutically rele- 
vant, such as the Inventory of Interpersonal Problems (Horowitz et 
al., 1988), without considering that the scales of this inventory are 
essentially measures of pathological forms of low extraversion and 
low agreeableness. In some of the older studies, however, it 
appears that it was not considered unusual to track changes in 
personality traits as they were more often considered the focal 
point of intervention in the middle of the 20th century (Worchel & 
Byrne, 1964). 

Given the number of studies that have tested some form of 
intervention and shown that these interventions are associated with 
changes in personality traits, we have followed up on, and greatly 
expanded, the analyses originally reported in Smith et al. (1980). 
We conducted a meta-analysis of clinical and nonclinical interven- 
tions on personality trait change and added several dimensions to 
our analyses that were not addressed in earlier reviews. The first 
goal of this meta-analysis was to test the extent to which person- 
ality traits could be changed, even in relatively short periods of 
time common in clinical interventions, such as 12 to 15 weeks. It 
is common for researchers to assume that personality traits do not 
change quickly (e.g., Roberts, 2006), but this position is often 
derived from examining long-term passive longitudinal studies. 
Research based on passive longitudinal studies is problematic for 
making any inference about the time course of personality trait 
change, because they typically fail to track personality traits often 
enough to know whether traits change quickly or gradually over 
time. Researchers conducting passive longitudinal studies typi- 
cally wait years between assessments of personality traits under 
the assumption that change does not happen quickly (e.g., Roberts 
et al., 2006). However, by not assessing personality over shorter 
intervals, these studies provide limited evidence for how fast or 
slow personality trait change can occur. Change may have happened 
gradually over the whole span of years, quickly at some point 
during the long period in which no measurements were taken, or 
in some combination of these patterns that varies across individuals (e.g., 
those in therapy compared to those not in therapy). 

In spite of this shortcoming, there are some findings from 
passive longitudinal studies that presage the possibility that 
personality trait change can happen quickly. While most people fail to 
change much on the majority of traits over long periods of time, 
such as 8 years, most people change substantially on one trait out 
of five over this type of time span (Roberts et al., 2001). Though 
no longitudinal study has tracked personality trait change using 
short time intervals between assessments (i.e., weeks or months) 
over long periods of time, it is possible that traits could have 
changed quickly at any time in the long spans that typify longitu- 
dinal research, or gradually over the same time period. Research 
on clinical interventions that result in personality trait change 
would provide a much-needed perspective on whether change in 
personality traits is achievable in a shorter period of time. 

The second goal of this meta-analysis was to compare the 
state-artifact and cause-correction hypotheses to the best of our 
ability given the data. To this end, we structured the outcome data 
in several ways in order to highlight conditions that should provide 
evidence to support either perspective. First, we tested whether 
personality trait change was associated with clinical interventions, 
broadly construed. Second, we examined personality trait change 
in true experimental studies that contrasted change in experimental 
groups to change in control groups. In most clinical intervention 
studies, even control groups may show modest improvements in 
measures of psychopathology (Smith et al., 1980). The key ques- 
tion relevant to the state-artifact hypothesis is whether intervention 
groups change more than control groups. Presumably any changes 
that occur in a control group, which receives no treatment, reflect 
the natural relief from the temporary inflection in personality 
caused by syndromes such as depression. Assuming that control 
groups in clinical settings, which are typically waitlist control 
groups, will improve without treatment, they provide an excellent 
comparison group for the state-artifact perspective. If both groups 
show the same amount of personality trait change, then there 
would be little reason to infer that the many nonexperimental 
studies showing personality trait change after therapeutic interven- 
tions are showing anything more than state change. 

Also relevant to the state-artifact and cause-correction hypoth- 
eses are the results from experimental designs in nonclinical stud- 
ies. The literature search resulted in the identification of a set of 
studies that were not clinical interventions per se; they either used 
a nonclinical intervention, a nonclinical group, or a combination of 
these two factors. As with the clinical studies, demonstrating that the 
intervention groups increased more than control groups provides a good 
test for drawing the causal inference that the intervention can lead 
to personality trait change rather than state change. But possibly 
more important is the pattern of change in the control groups for 
the nonclinical studies. Unlike the clinical studies, there would be 
no reason to expect significant changes in these control groups, as 
they would not have a disproportionate number of individuals with 
preexisting forms of psychopathology. On the other hand, these 
control groups would address other artifactual reasons for slight 
changes over time, such as testing effects from repeated adminis- 
tration of the personality inventory. A final way to distinguish the 
state-artifact and cause-correction positions is to examine how 
long the changes persist. If therapy produces change only in states, 
then we would expect the putative effect of therapy to wear off 
with time and for the personality trait scores to return to their 
baselines. In order to test these effects, we compiled the subset of 
studies that performed follow-ups of the intervention samples after 
the termination of therapy. 

In summary, this meta-analysis was designed to address a num- 
ber of enduring questions about personality trait change. This 
study is also positioned to address many other central questions 
about interventions and personality change. First, we tracked 
changes across the Big Five trait categories (extraversion, agree- 
ableness, conscientiousness, neuroticism, and openness to experi- 
ence) to see in what domain(s) people show the most change. 
Given the overriding importance of neuroticism to most clinical 
issues, we expected to find the largest effects for neuroticism in 
studies of clinical orientation (Kotov et al., 2010). We also ex- 
pected changes to occur on extraversion as it is strongly related to 
positive affect, and thus various forms of psychopathology, such as 
depression (e.g., Clark & Watson, 1991; Watson & Naragon- 
Gainey, 2014). Second, we sought to determine whether the type 
of therapeutic intervention, such as cognitive-behavioral therapy 
or supportive therapy, made any difference to the outcome. It is 
common in meta-analytic studies to find little difference between 
various forms of psychotherapy on clinical outcomes, such as 
depression or anxiety (Luborsky et al., 2002), but the question 
remains whether the same is true for personality traits as outcomes. 
If the findings for typical clinical outcomes, such as depression, 
hold for personality traits, we should expect few differential 
treatment effects for various forms of therapy. Third, we examined 
whether the type of disorder being treated moderated the amount 
of change found over time. Specific psychological disorders, such 
as eating disorders, are known to be very difficult to treat (Kaplan 
& Garfinkel, 1999). Finally, we sought to determine whether several 
additional factors, such as the length of the intervention, gender 
composition of the sample, and year of publication, moderated the 
amount of personality trait change. 


Method 


Literature Search Procedures 


The data sources for the current meta-analysis were journal 
articles as well as unpublished dissertations and theses. Multiple 
steps were taken to identify usable studies. First, we went through 
the reference list of Smith et al.’s (1980) seminal meta-analytic 
work on the effect of psychotherapy and all relevant reviews to 
identify studies that examined the change of personality traits as a 
result of interventions. Next, we conducted an inclusive search in 
the American Psychological Association’s PsycINFO database 
(1887-2012), Google Scholar, Web of Science, and the ProQuest 
Dissertations and Theses database for any combination of the key 
words psychotherapy and personality change, selective serotonin 
reuptake inhibitors and personality, SSRIs and personality, per- 
sonality and a specific SSRI (fluoxetine, fluvoxamine, citalopram, 
duloxetine, paroxetine, sertraline, escitalopram, venlafaxine), and 
treatment outcomes and personality change. We inspected the 
citations from each usable study and review in our database as well 
as studies that referred to each study for additional qualified 
studies and continued this iterative process until no new study 
could be found. We also contacted researchers who had authored 
multiple studies that tracked personality trait change in clinical 
intervention studies to acquire any studies we had missed and any 
unpublished studies. Finally, we circulated a request for relevant 
studies to several listservs, including the Society for Personality 
and Social Psychology, Association for Research in Personality, 
Division 12 of APA, and the Society for the Research on Psycho- 
pathology asking for unpublished research on personality trait 
change in intervention research. 

Studies were evaluated for inclusion in the current meta-analysis 
on the basis of the following criteria. First, the study needed to 
measure one or more personality trait variables. We chose a very 
specific operationalization of personality traits to make sure the 
focus was on measures typically thought of as traits. Specifically, 
we only included studies if the ratings of the items in the measures 
used were (a) global; (b) general in terms of overall functioning; 
and (c) focused on implicitly or explicitly enduring patterns of 
thoughts, feelings, and behaviors. A prototypical measure that 
fulfills these criteria asks people to rate general items about per- 
sonality (e.g., I’m a talkative person) using rating scales intended 
to capture general patterns (e.g., How characteristic or uncharac- 
teristic of you is this statement?). With these criteria applied, 
studies that examined state (e.g., affects, state anxiety) rather than 
trait variables, behavioral frequencies (e.g., frequency of aggres- 
sive behaviors), and social adjustment or maladjustment (e.g., 
marital satisfaction, relationship conflict) were excluded from the 
meta-analysis. Second, the study was considered for inclusion if it 
employed a prepost test design, that is, an experimental design 
such that personality trait change from pre- to posttest or the 
difference between control and treatment groups was reported. 
Third, effect sizes of personality trait change must have been 
reported in the study or able to be calculated from the reported 
results. Studies that reported incomplete results for calculating 
effect sizes (e.g., trait scores for only one group at one time point, 
mean trait scores without any standard deviations) were excluded 
from our analysis. 

After applying these inclusion criteria, our searches resulted in 
207 studies (35 experimental studies) with a total of 357 samples 
(many studies included separate results for multiple samples), 
published or completed between 1959 and 2013. The overall 
sample of our meta-analytic database included a total of 20,024 
participants with an overall average of 63.41% female. The mean 
age of the samples ranged from 19 to 73 and averaged 36.04 years. 
Nearly one quarter of the samples (k = 77) followed up the 
participants longitudinally to monitor the long-term effect of the 
interventions with the longest follow-up occurring 16 years after 
therapy ended. Although we focused on clinical interventions, we 
found a large number of “nonclinical” studies that either tested (a) 
a nonclinical intervention (e.g., efforts to improve cognitive func- 
tioning); or (b) a clinical intervention, such as administration of 
antidepressants, applied to a healthy sample of adults. We exam- 
ined these studies separately. A full list of these studies is provided 
in supplementary Table 1. The data files used to estimate the 
results and supplementary tables can be found at https://osf.10/ 
Atys6/. 


Coding of Study Variables 


At the onset of the meta-analysis, we developed a detailed 
codebook for recording a range of characteristics of the studies 
(see online appendixes), the samples, the interventions used in the 
studies, the personality variables measured, and the effect sizes for 
personality change. Each sample from every usable study was 
coded for key study variables, including study design (experimental 
design, prepost test design), type of treatment (cognitive-behavioral, 
supportive, psychoanalytic, pharmacological, hospitalization, 
and mixed), duration of intervention (in weeks), time 
interval for the follow-up in longitudinal studies (in weeks), sam- 
ple mean age (in years), percentage of females in the sample, 
clinical and nonclinical sample type, type of psychological disor- 
ders/maladjustment presented (depression, anxiety, personality 
disorder, eating disorder, alcohol or drug abuse, and mixed), 
personality inventory used, and Big Five personality trait assessed. 
Several researchers coded the studies and agreement in coding was 
checked several times. Agreement was good for the coding of type 
of treatment (κ = .79), and for type of presenting psychological 
disorders/maladjustment (κ = .96). Agreement for computed effect 
sizes for each study was also quite high (r = .98). Any 
disagreement in the ratings was addressed by discussion between 
the authors. 
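
As an illustration of the agreement statistic reported above, the following minimal sketch computes a chance-corrected kappa for two coders. It assumes a two-rater Cohen's kappa; the source does not state which kappa variant was used or how many coders contributed to each value, and the function name and category labels are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two coders (Cohen's kappa).

    Assumption: exactly two raters coding the same items into nominal
    categories. The source reports kappa values but not the variant used.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items.")
    n = len(rater_a)
    # Proportion of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each coder's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical treatment-type codes for four samples.
coder_1 = ["cbt", "cbt", "pharm", "pharm"]
coder_2 = ["cbt", "cbt", "pharm", "cbt"]
kappa = cohens_kappa(coder_1, coder_2)
```

Kappa is undefined when chance agreement equals 1 (both coders use a single category throughout); the sketch does not guard against that case.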

Personality scales that did not directly assess the Big Five 
personality traits (e.g., the 18 scales in the California Personality 
Inventory) were sorted into corresponding Big Five categories, if 
any, based on multiple criteria: (a) information from studies that 
documented the development of the personality scales, (b) studies 
that examined the correlations among these personality scales and 
established Big Five measures, and (c) studies that used these 
personality scales in our database. When a personality scale was 
determined to be a combination of several Big Five trait dimen- 
sions, we coded the effect size as “blended” to reflect this deter- 
mination. This classification most often occurred for clinically 
oriented traits (e.g., hysteria, assaultiveness), evaluative traits (e.g., 
mistrust, socialization), or higher-order traits (e.g., well-being, 
harm avoidance). In the prepost dataset, effect sizes marked as 
blended were most frequently associated with neuroticism (n = 
248) and extraversion (n = 208), with substantially fewer obser- 
vations associated with agreeableness (n = 89), conscientiousness 
(n = 139), and openness to experience (n = 24). A total of 606 
observations were labeled blended. Three of the authors indepen- 
dently completed this process. Disagreement was resolved by 
discussion. A complete list of coding classifications for study 
variables is presented in supplemental Table 2. 


Analytical Procedures 


For the present analysis, we calculated effect sizes for 
personality trait change using Cohen’s d (Cohen, 1988). For studies 
that used a prepost test design, effect sizes were computed as 
the difference between post- and pretest scores divided by the 
pretest standard deviation of these scores. For experimental 
studies, the effect sizes reflect postintervention difference be- 
tween the treatment group and the control group divided by 
their pooled standard deviation. We reversed the sign of the 
effect sizes for change in negative personality traits to ensure 
that the effect sizes were always positive when patients im- 
proved after intervention (i.e., score change on a measure of 
emotional stability was in the positive direction) and were 
negative when patients exhibited undesirable change after in- 
tervention (i.e., score change on a measure of emotional stabil- 
ity was in the negative direction). Effect sizes reported in 
metrics other than Cohen’s d (e.g., correlation coefficient r, 
Student’s t statistic, one-way F statistic) were converted to 
Cohen’s d before being analyzed. 
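
The effect size definitions in this paragraph can be sketched in a few lines of code. The two d formulas follow the text directly. The conversion formulas from r, t, and F are common textbook conversions for two independent groups and are an assumption here, because the source does not say which conversions were applied. All function names are hypothetical.

```python
import math

def d_prepost(mean_pre, mean_post, sd_pre):
    """Prepost design: (posttest mean - pretest mean) / pretest SD."""
    return (mean_post - mean_pre) / sd_pre

def d_between(mean_treat, mean_ctrl, sd_treat, sd_ctrl, n_treat, n_ctrl):
    """Experimental design: posttest group difference / pooled SD."""
    pooled_var = ((n_treat - 1) * sd_treat ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / (
        n_treat + n_ctrl - 2
    )
    return (mean_treat - mean_ctrl) / math.sqrt(pooled_var)

def orient(d, higher_is_better):
    """Flip the sign for negatively keyed traits (e.g., neuroticism) so that
    a positive d always means improvement."""
    return d if higher_is_better else -d

# Assumed conversions (standard two-independent-group formulas).
def r_to_d(r):
    return 2 * r / math.sqrt(1 - r ** 2)

def t_to_d(t, n1, n2):
    return t * math.sqrt(1 / n1 + 1 / n2)

def f_to_d(f, n1, n2):
    # A one-way F with two groups equals t squared, so the sign is lost
    # and must be recovered from the reported group means.
    return math.sqrt(f) * math.sqrt(1 / n1 + 1 / n2)

# Example: neuroticism drops from 30 to 25 (pretest SD = 10). The raw d is
# -0.5; after orientation it is +0.5, i.e., an improvement.
d_neuroticism = orient(d_prepost(30, 25, 10), higher_is_better=False)
```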

A total of 2,144 effect sizes were obtained from the usable 
studies in our meta-analytic database. In most cases, multiple 
effect sizes were available for each sample because several per- 
sonality scales (e.g., all five scales of the NEO Five Factor Inven- 
tory) were used to assess the participants. For the within-subject, 
prepost change scores, the variance estimates were based on 
Becker (1988, Equation 13). Because this equation requires knowl- 
edge of the test-retest correlation, which was almost never re- 
ported, we assumed an average test-retest of r = .5 and tested the 
robustness of the effect by systematically varying the test-retest 
correlations from .25 to .75.¹ We also calculated the 95% confi-
dence intervals of our point estimates. A confidence interval not
including zero means that the change of personality was significant
at α = .05. We also separated the experimental effect sizes from
the prepost effect sizes, as these two types of studies are only 
directly comparable under restrictive assumptions (Morris & De- 
Shon, 2002). 
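
As a rough illustration of why the assumed test-retest correlation matters, the sketch below uses a large-sample approximation for the sampling variance of a prepost d, 2(1 - r)/n + d²/(2n). We assume this form stands in for Becker's (1988) Equation 13, which is not reproduced in the text; the function names and the example values of d and n are hypothetical.

```python
import math

def prepost_variance(d, n, r=0.5):
    """Approximate sampling variance of a prepost d (assumed formula)."""
    return 2 * (1 - r) / n + d ** 2 / (2 * n)

def ci95(d, variance):
    """Normal-theory 95% confidence interval around d."""
    half_width = 1.96 * math.sqrt(variance)
    return d - half_width, d + half_width

# Sensitivity check across the range of assumed test-retest correlations.
for r in (0.25, 0.50, 0.75):
    v = prepost_variance(d=0.37, n=100, r=r)
    low, high = ci95(0.37, v)
    print(f"r = {r:.2f}: variance = {v:.4f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

Higher assumed stability shrinks the variance, so the same d is estimated more precisely; an interval that excludes zero corresponds to significance at α = .05.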

Our meta-analytic search produced a large body of literature 
with evidence concerning personality change in response to ther- 
apy. We extracted the maximum amount of information from each 
article, resulting in a number of dependencies within the dataset. 
For example, most articles reported personality change across a 
number of dimensions, and some articles reported estimates of 
change across different waves of longitudinal follow-ups to the 
sample. Each effect size is informative, particularly for testing 


¹ We found no notable differences in our results using alternative esti-
mated average test-retest stabilities, so we report only the estimates for r = .50.
moderator effects. To obtain robust effect sizes and standard
errors, we applied two techniques. First, we used the COMPLEX 
survey option in Mplus 7.31 to correct standard errors for the 
nonindependence of observations derived from the same sample 
(Muthén & Muthén, 1985-2015). Second, we weighted the obser- 
vations by the inverse of the number of observations derived from 
a given sample as well as the inverse of the sampling variance 
associated with the observation. These approaches allow the use of 
the full dataset to estimate accurate effect sizes and standard errors. 
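
One way to read the weighting scheme is as the product of the two inverses, so that a sample contributing many effect sizes does not dominate the estimate. The sketch below follows that reading; it is an assumption about how the two weights were combined, and the function name is hypothetical. It does not reproduce the Mplus COMPLEX standard error correction.

```python
from collections import Counter

def observation_weights(sample_ids, variances):
    """Weight = 1 / (number of effect sizes from the sample * sampling variance)."""
    counts = Counter(sample_ids)
    return [1.0 / (counts[s] * v) for s, v in zip(sample_ids, variances)]

# Two effect sizes from sample "a", one from sample "b".
weights = observation_weights(["a", "a", "b"], [0.01, 0.02, 0.04])
```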

We used the general paradigm laid out by Cheung (2008) for 
random-effects and mixed-effects meta-analysis using struc- 
tural equation modeling. Random-effects meta-analysis is gen- 
erally considered the more robust and conservative meta- 
analytic approach compared with fixed-effects meta-analysis, 
which makes the assumption that all effect sizes are drawn from 
the same population. Given the tremendous diversity of studies 
included in the current meta-analysis, such an assumption is 
unlikely to hold. Using random-effects meta-analysis, between-
study variance in effect sizes is estimated (denoted τ in the
results). Mixed-effects meta-analysis attempts to explain such
between-study variance by incorporating study-level modera-
tors, such as Big Five dimension, intervention type, or present-
ing problem. Variance explained (i.e., R²) by the moderators
can be calculated as one minus the ratio of between-study variance
when moderators are included to between-study variance when
moderators are not included (e.g., .3 variance with moderators
compared with .6 without implies the moderators explain 50% of
the between-study differences). More broadly, our metaregression approach
allows for the testing of moderators separately while simulta- 
neously taking into account the nested structure of the data to 
generate robust effect size estimates, regression coefficients for 
the moderators, and standard errors. 
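
Written out, the variance-explained calculation described above takes the following form. Here τ denotes between-study variance, as in the text; the subscripts are labels introduced only for this illustration.

```latex
R^2 = 1 - \frac{\hat{\tau}_{\text{with moderators}}}{\hat{\tau}_{\text{without moderators}}}
    = 1 - \frac{.3}{.6} = .50
```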

Small study effects (i.e., statistical noise and publication 
bias) pose special problems for meta-analysis. Unfortunately, a 
good tool to adjust for various small study effects does not 
exist, and experts disagree about the circumstances under which 
adjustment for small study effects is appropriate. Techniques 
commonly used in psychological research, such as trim-and-fill 
(Duval & Tweedie, 2000) or the failsafe N (Rosenthal, 1979), 
have been thoroughly criticized (Moreno et al., 2009; Peters et 
al., 2007; Sutton, 2009). Others have argued for regression- 
based tests where precision (i.e., standard errors) is used to 
predict effect sizes (Egger, Davey Smith, Schneider, & Minder, 
1997). More recently, some have argued that the intercept of 
such a model, where the implied standard error is zero, indicates 
the effect size for the theoretical perfect study (Stanley, 2008; 
Stanley & Doucouliagos, 2014). Again, such tests have drawn 
support and criticism (Peters et al., 2010; Sterne et al., 2011). 
Most critically, Ioannidis (2008) argued that when effect sizes 
are even moderately heterogeneous, regression-based tests are 
“inappropriate, meaningless, or both” (p. 954). Effect sizes in 
the current study are heterogeneous, drawn from different pop- 
ulations, disorders, and treatments. This result is demonstrated 
empirically below. Recently, the performance of various bias-
correction strategies has been put to the test under more
realistic circumstances with underwhelming results (e.g., Reed, 
Florax, & Poot, 2015; Inzlicht, Gervais, & Berkman, 2015). The 
primary recommendation of this line of work is that some 
correction is better than none, but no one technique is unam- 


biguously better than others. Trustworthy effects should not be 
sensitive to the choice of correction. 

We employ two techniques to provide potentially more con- 
servative estimates of personality trait change, while acknowl- 
edging that such tests are incapable of providing exact metrics 
for personality trait change. First, we extend our random-effects 
meta-analysis to include the squared standard error as a mod- 
erator (PEESE model, following Stanley & Doucouliagos, 
2014). This approach adjusts for the potential for small studies 
to report large effect sizes by interpreting the intercept of the 
model. We integrate the predictor of study precision into our 
larger metaregression framework in order to maintain consistency
across analyses. Second, we complement the strictly statistical
metaregression approach with a visual inspection of funnel 
plots. As can be seen in Figure S1, funnel plot asymmetry 
occurs for prepost effect sizes associated with standard errors 
greater than roughly .2. These relatively small studies tend to 
have substantially larger effect sizes. Therefore, we ran all 
models excluding effect sizes with standard errors greater than 
.2 (302 observations), leaving 83.71% of the original dataset. 
For comparison, the trim-and-fill approach indicated that there 
were eight effect sizes missing from the funnel plot, meaning 
our visual correction should be more conservative. Because the 
prepost dataset was so large, we were also interested in whether 
results held when only high precision studies were included. To 
test this question, we excluded effect sizes with standard errors 
greater than .1 (1,310 observations), leaving 29.34% of the 
original dataset. For comparison, Stanley, Jarrell, and Doucou- 
liagos (2010) suggested analyzing only the studies in the top 
tenth percentile of precision. Our criterion is not quite as 
extreme, but tends toward that direction. The experimental 
dataset was much smaller, and as can be seen in Figure S2, 
visual evidence for small study effects was less definitive. As a 
comparable test, we excluded effect sizes with standard errors 
greater than .4 (53 observations), leaving 76.86% of the original 
sample. In this case, the trim-and-fill approach indicated that 
there were two missing effect sizes. We did not test more 
stringent cutoffs because of the smaller pool of observations. If 
the levels of personality trait change identified in the meta- 
analysis are meaningful, the effects should be robust across a 
variety of analytic choices. 
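
The two corrections can be illustrated with a deliberately simplified sketch. It fits PEESE as an inverse-variance weighted regression of effect size on the squared standard error and reads off the intercept, and it applies a standard error cutoff such as .2. This is not the model estimated in the paper, which embeds the precision predictor in a random-effects metaregression in Mplus; the function names are hypothetical.

```python
def peese_fit(effects, std_errors):
    """Weighted least squares of d on SE^2 with weights 1 / SE^2.

    Returns (intercept, slope). The intercept is the expected effect
    size for a hypothetical study with a standard error of zero.
    """
    x = [se ** 2 for se in std_errors]
    w = [1.0 / se ** 2 for se in std_errors]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, effects))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def drop_imprecise(effects, std_errors, max_se):
    """Keep only effect sizes whose standard error does not exceed max_se."""
    kept = [(d, se) for d, se in zip(effects, std_errors) if se <= max_se]
    return [d for d, _ in kept], [se for _, se in kept]
```

If small studies report inflated effects, the slope on the squared standard error is positive and the intercept falls below the unadjusted average, which is the pattern the PEESE column in the tables is meant to capture.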


Results 


Descriptive Statistics 


Supplementary Table 1 lists all 207 studies and the basic 
information about each study, including sample size, measures 
used, clinical versus nonclinical, type of therapy employed, 
outcomes, and how they were coded. The majority of studies 
were clinical interventions intended to treat some form of 
psychopathology. The most common disorder treated was de- 
pression, followed by anxiety and eating disorders. Nineteen 
studies examined change in personality traits in nonclinical 
samples. Table 1 provides descriptive statistics for the accumu- 
lated data. The average duration of therapy was 24 weeks (mode 
and median were 12 and 13 weeks, respectively), indicating that 
most studies tracked changes over a 3-to-6 month period, 


Table 1
Descriptive Statistics of Meta-Analytic Database (K = 207)

Average publication year              2001 (SD = 10.57)
Percentage of females                 63.41 (SD = 20.10)
Mean age of sample                    36.04 (SD = 7.47)
Duration of treatment (in weeks)      23.75 (SD = 35.96)
K                                     207
Total N                               20,024
Longest follow-up                     Average (month): 6.80
                                      Number of studies that tracked changes beyond
                                      termination of therapy: 77
Big Five categories                   Extraversion: 91 studies (18.35%)
                                      Agreeableness: 86 studies (14.14%)
                                      Conscientiousness: 74 studies (16.19%)
                                      Emotional stability: 199 studies (51.97%)
                                      Openness: 44 studies (4.86%)
                                      Blended: 67 studies (33%)
Intervention type                     Pharmacological: 81 (22.69%)
                                      Cognitive-behavioral: 65 (18.21%)
                                      Supportive/humanistic: 13 (3.64%)
                                      Psychoanalytic: 39 (10.92%)
                                      Hospitalization not otherwise specified: 8 (2.24%)
                                      Mixed: 115 (32.21%)
Presenting problem                    Depression: 74 (20.73%)
                                      Anxiety: 77 (21.57%)
                                      Personality disorder: 43 (12.04%)
                                      Eating disorder: 26 (7.28%)
                                      Substance use: 23 (6.44%)
                                      Mixed: 67 (18.77%)
                                      Other: 4 (1.12%)
                                      None: 43 (12.04%)
Number of studies with experiments    35 out of 207

Note. The parenthetical percentages for the Big Five categories, intervention type, and presenting problem
represent the percentage of effect sizes when the entire data set was analyzed. Blended traits were those coded
as reflective of more than a single trait domain.



though a handful of studies tracked changes for a much longer 
time interval. 


Overall Effects of Interventions on Personality 
Trait Change 


We present results in two complementary ways. First, Table 2 
presents effect size expectations based on metaregression mod- 
els applied to prepost effect sizes. These effect sizes are derived 
from the model-implied metaregression equation, and the re- 
ported confidence intervals are useful for determining whether 
an effect size is different from zero. Second, supplementary 
tables report the regression parameters from our models. Be- 
cause we use effects coding for all (noncontinuous) analyses, 
the regression parameters represent deviations from the mid- 
point. Conceptually, these parameters indicate whether effect 
sizes within a specific moderator class (e.g., emotional stabil- 
ity) differ from the expected average of general effect sizes 
within that moderator. In this context, the reported confidence 
intervals are useful for determining whether a specific moder- 
ator class differs from the expected average, rather than zero. 
Figure 1 summarizes our results and integrates these two frames 
of reference. The solid lines represent the midpoint effect size 
for that class of moderators. The dots represent expected effect 
sizes for a specific class of effect sizes. The confidence inter- 
vals can be used to infer whether the specific class of effect size 
differs from the expected midpoint. Importantly, all analyses 


make use of the entire relevant dataset, allowing for direct 
comparisons of effect sizes while correcting for the nested 
structure of the data. Here, we give primary attention to the 
effect size estimates. 
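
The interpretation of effects-coded parameters can be made concrete with a small sketch. With effects coding, the intercept equals the unweighted mean of the model-implied class values (the midpoint) and each coefficient is that class's deviation from the midpoint. The input values below are the Full Model Big Five estimates from Table 2, used only as illustrative inputs; the output is not a reproduction of the supplementary regression parameters, and the function names are hypothetical.

```python
def effects_code_row(level, levels):
    """Effects-coded predictors for one class; the last class is coded -1."""
    if level == levels[-1]:
        return [-1] * (len(levels) - 1)
    return [1 if level == other else 0 for other in levels[:-1]]

def midpoint_and_deviations(class_values):
    """Intercept (midpoint) and per-class deviations under effects coding."""
    midpoint = sum(class_values.values()) / len(class_values)
    deviations = {k: v - midpoint for k, v in class_values.items()}
    return midpoint, deviations

big_five = {
    "extraversion": 0.23, "agreeableness": 0.15, "conscientiousness": 0.19,
    "emotional stability": 0.57, "openness": 0.13, "blended": 0.27,
}
midpoint, deviations = midpoint_and_deviations(big_five)

# The expected effect size for a class is the midpoint plus the
# coefficients selected by that class's effects-coded row.
levels = list(big_five)
coefs = [deviations[name] for name in levels[:-1]]
row = effects_code_row("emotional stability", levels)
expected = midpoint + sum(c * x for c, x in zip(coefs, row))
```

A confidence interval around a deviation therefore tests whether that class differs from the midpoint, whereas a confidence interval around an expected value tests whether it differs from zero.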

Focusing first on the average prepost effect size, personality 
tended to change moderately in the studies, with a d of .37, 95% 
CI [.33, .40]. This effect was robust across different corrections 
for small study effects (minimum d = .21, 95% CI [.17, .24]). 
However, there was considerable evidence of effect size heter- 
ogeneity, with estimates of systematic between-study variance 
ranging from .22 to .27 across models (see Table S3). Thus, 
without partitioning the data in any way, and when taking into 
account publication bias, clinical and nonclinical interventions 
appear to change personality traits by between one fifth and one
third of a standard deviation, with robust evidence for hetero- 
geneity and the possibility that moderators affected those esti- 
mates. 


Findings Relevant to the State-Artifact and Cause- 
Correction Hypotheses 


We next turned to the particular study designs that provided 
information that could inform whether the evidence in totality 
supports the state-artifact or the cause-correction hypothesis about 
the effects of clinical interventions on personality trait change. 
Specifically, we focused on (a) true experimental designs, which 
contrast groups receiving treatment against those who do not; and 
Table 2
Pre-Post Personality Change Effect Size Estimates

Moderator categories      Full Model          PEESE Model         Visual Model        High Power
                          ES [95% CI]         ES [95% CI]         ES [95% CI]         ES [95% CI]

Panel 1: Average effect size
  Average effect size     .37 [.33, .40]      .21 [.17, .24]      .31 [.28, .34]      .28 [.24, .32]

Panel 2: Clinical vs. Nonclinical vs. Control
  Clinical treatment      .38 [.34, .42]      .23 [.19, .26]      .33 [.29, .36]      .29 [.24, .33]
  Clinical control        .24 [.14, .34]      0 [-.11, .11]       .19 [.10, .28]      .17 [.09, .25]
  Nonclinical treatment   .33 [.19, .46]      .06 [-.05, .17]     .25 [.14, .35]
  Nonclinical control     -.03 [-.08, .03]    -.24 [-.35, -.14]   -.03 [-.09, .04]
  Comparison group        .15 [.09, .21]      -.06 [-.15, .02]    .16 [.09, .22]

Panel 3: Follow-up interval
  Immediate               .34 [.31, .37]      .18 [.14, .22]      .29 [.26, .32]      .28 [.24, .33]
  6 month                 .48 [.36, .60]      .30 [.20, .41]      SIET SL]            .39 [1753]
  12 month                .46 [.32, .60]      .27 [.16, .37]      .36 [.24, .49]      .16 [.04, .27]
  1 year +                .37 [.26, .47]      .24 [.15, .32]      .30 [.21, .40]      .25 [.13, .36]

Panel 4: Big Five
  Extraversion            .23 [.17, .29]      .14 [.08, .21]      .22 [.16, .27]      .22 [.14, .29]
  Agreeableness           .15 [.11, .20]      .06 [.02, .11]      .14 [.10, .18]      .12 [.07, .17]
  Conscientiousness       .19 [.14, .23]      .10 [.05, .15]      .18 [.13, .23]      .17 [.11, .24]
  Emotional stability     .57 [.52, .62]      .39 [.33, .44]      .49 [.44, .54]      .49 [.42, .55]
  Openness                .13 [.07, .18]      .04 [-.02, .09]     .12 [.07, .17]      .10 [.03, .17]
  Blended                 .27 [.23, .31]      el] [.12, .21]      .24 [.21, .28]      .24 [.19, .29]

Panel 5: Intervention type
  Pharmacological         .31 [.26, .35]      .14 [.09, .19]      .25 [.21, .29]      .25 [.19, .31]
  Cognitive behavioral    .46 [.37, .54]      .26 [.18, .34]      .36 [.28, .43]      .34 [.23, .45]
  Supportive              .49 [.36, .61]      .31 [.18, .44]      .45 [.33, .57]      .37 [.36, .38]
  Psychodynamic           .38 [.28, .48]      .20 [.11, .29]      .35 [.25, .44]      .22 [.11, .34]
  Hospital                .16 [.07, .26]      .09 [.00, .18]      .16 [.07, .25]      .15 [.06, .25]
  Mixed                   .41 [.34, .48]      .26 [.20, .32]      .36 [.29, .42]      .31 [.24, .38]

Panel 6: Presenting problem
  Depression              .36 [.31, .41]      sal [.21, .32]      .35 [.30, .40]      .34 [.28, .40]
  Anxiety                 .54 [.42, .66]      .22 [.11, .32]      .36 [.26, .46]      .25 [.07, .43]
  Personality disorder    .53 [.39, .67]      .25 [.15, .36]      .37 [.27, .48]      .25 [.06, .44]
  Eating disorder         .24 [.15, .32]      .07 [-.02, .16]     .20 [.11, .29]      .10 [.04, .16]
  Substance use           .22 [.15, .29]      .15 [.09, .22]      .22 [.15, .29]      .20 [.12, .28]
  Mixed                   .40 [.32, .48]      .25 [.18, .32]      .37 [.29, .44]      .31 [.21, .41]

Note. Expected effect sizes reported are derived from the meta-regression model. Full Model reports results based on a weighted random effects model. PEESE
Model reports results additionally controlling for the squared standard error to further correct for small study effects. Visual Model reports results based
on a weighted random effects model where studies from the asymmetrical portion of the funnel plot are omitted (SEs > .2). High Power Model reports results based
on a weighted random effects model only including high power studies (SEs < .1). ES = effect size; CI = 95% confidence interval. Due to limited data
availability for nonclinical samples, the High Power Model for Panel 2 was estimated only for clinical effect sizes. The estimates of the prepost scores for
the clinical treatment and control groups do not reflect the overall experimental effect because these estimates only include those studies that reported
prepost scores, which is a subset of the studies used to estimate the experimental effect.



(b) the duration of the effects of interventions after they had been 
terminated. Table 3 presents effect size expectations for experi- 
mental effect sizes across the full dataset. On average, the expe- 
rience of psychotherapy led to an increase of .43, 95% CI [.30, .55] 
standard deviations compared to control groups.² The overall
effect size was largely consistent across clinical and nonclinical 
settings (see Table S4, moderator p = .53). 

To better understand the experimental effects, we also computed 
the prepost changes for a set of control groups that were available in 
the set of studies examined. Even with random assignment, difference 
scores between intervention and control groups may be affected by 
differences in means that existed at the initiation of the study, which 
may bias the differences at the end of therapy. Tracking control group 
prepost change scores allows for a check on the experimental results. 
Most of these control groups came from the experimental studies, but 
several comparison group prepost changes also were reported in 
quasi-experimental studies in which the participants were tracked for 
time periods equivalent to those receiving treatment. As can be seen 


in Table 2, change patterns found in the various control groups were 
consistent with the inference that the interventions caused more 
change than not intervening. Specifically, although the clinical control groups
did show improvement, d = .24, 95% CI [.14, .34], the nonclinical control
and comparison groups showed markedly less change, d = -.03, 95%
CI [-.08, .03] and d = .15, 95% CI [.09, .21], respectively (see Table 2).


² The effect size estimate was reduced only slightly when omitting low
power studies, but the PEESE correction indicated that the effect may be 
substantially smaller (d = .13, 95% CI [—.10, .36]). However, the impre- 
cision of this estimate warrants some caution as the experimental dataset 
contained relatively few high precision studies (see Figure S2), a critical 
requirement for accurate PEESE estimates. As a symptom of this issue, one 
can inspect the increase in the confidence interval width when PEESE is 
applied. In the prepost dataset, this results in an increase of confidence 
interval width of 5.7%. For the experimental dataset, the increase is 89.4%. 
Taken as a whole, the experimental results are less robust than the prepost 
effects, but the more realistic visual correction still finds evidence of 
substantial treatment effects, d = .40, 95% CI [.28, .51]. 
[Figure 1 plot: Pre-Post Effect Sizes. Axis: Effect Size (d). Moderator classes shown: Sample Type, Follow-up Interval, Big Five Dimension, Intervention Type, Presenting Problem.]



These two differences between the clinical and other groups are 
statistically significant (see Table S3, p < .001 and p < .05, respec- 
tively). The findings for the nonclinical studies are highly relevant to 
the inference that interventions actually change personality traits as 
clinical studies are confounded by the nature of the participants 
seeking therapy. In contrast, people in nonclinical experiments would 
not necessarily be at a low point in their psychological functioning 
and would not be expected to be systematically affected by the simple 
passage of time. In sum, it appears that the prepost and experimental 
effects for interventions are associated with changes in personality 
traits that exceed what happens to individuals who are not receiving 
treatment. 

A second way of addressing the state-artifact hypothesis is to test 
whether the changes due to therapy are short-lived. Thus, long-term 
follow-ups of the intervention groups are critical. If the effect of 
therapy fades when treatment ends, then one would assume that 
therapy provided a short-term boost to positive affect or alleviation of 


negative affect, which should fade with time and, in turn, drag
trait scores back down to the true trait level. To evaluate
this position, we broke the findings down by follow-up interval (see
Table 2). As the results in Table 2 show, there appears to be no
marked decrease in the effect sizes with time and follow-up interval.
The average effect of treatment was d = .34, 95% CI [.31, .37]
immediately following treatment, d = .48, 95% CI [.36, .60] for
studies tracking changes zero to six months after treatment, and d = .46,
95% CI [.32, .60] for studies tracking outcomes six months to one
year after treatment. Finally, the studies following up samples for an even longer
period of time showed no decline in the effect of therapy, with the
changes from one or more years after treatment being d = .37, 95%
CI [.26, .47]. It should be noted that all of the effects across various
follow-up intervals were robust to bias. Moreover, the moderator
analysis shown in Table S3 indicates that the only estimate that
differed was the effect immediately following treatment (p = .02,
other p’s > .17).

Figure 1. Deviation plot for major moderators. (Axis label: Moderator Category.)

Table 3
Experimental Personality Change Effect Size Estimates Comparing Treatment to Control

Moderator categories      Full Model          PEESE Model         Visual Model
                          ES [95% CI]         ES [95% CI]         ES [95% CI]

Panel 1: Average effect size
  Average effect size     .43 [.30, .55]      .13 [-.10, .36]     .40 [.28, .51]

Panel 2: Clinical vs. Nonclinical
  Clinical                .45 [.29, .61]      .15 [-.08, .38]     .40 [.25, .55]
  Nonclinical             .36 [.15, .57]      .04 [-.33, .41]     .39 [.20, .58]

Panel 3: Big Five
  Extraversion            .38 [.18, .58]      .20 [-.11, .51]     .39 [.19, .60]
  Agreeableness           .23 [.08, .38]      .03 [-.29, .35]     .23 [.09, .38]
  Conscientiousness       .06 [-.05, .16]     -.18 [-.52, .16]    .10 [.02, .18]
  Emotional stability     .69 [.45, .93]      .39 [.07, .70]      .59 [.35, .82]
  Openness                .36 [.23, .49]      .24 [-.04, .52]     .38 [.29, .46]
  Blended                 .27 [.10, .44]      .01 [-.26, .28]     .27 [.10, .44]

Note. Expected effect sizes reported are derived from meta-regression models. Positive effect sizes indicate that personality changed more for treatment
groups relative to control groups. Full Model reports results based on a weighted random effects model. PEESE Model reports results additionally controlling
for the squared standard error to further correct for small study effects. Visual Model reports results based on a weighted random effects model where studies
from the asymmetrical portion of the funnel plot are omitted (SEs > .4). ES = effect size; CI = confidence interval.

Taken as a whole, the findings showed that clinical and non-
clinical interventions resulted in increases in positive traits over 
very short time intervals in true experimental designs. Nonclinical 
control groups showed little evidence of placebo effects commonly 
seen in clinical control groups. Moreover, the changes tended to be 
retained following therapy for relatively long time intervals. When
combined, these results provide more support for the cause- 
correction hypothesis that therapeutic interventions may impart 
changes in personality traits. In contrast, the state-artifact hypoth- 
esis predicts that treatment and control groups would show similar 
personality increases due to artificially lowered means resulting 
from states, and because states are typically defined as lasting from 
moments to at most days, any personality change would not last a 
month, let alone over a year. 


Which Domain of Personality Trait Changes 
the Most? 


Next, we turned to the question of which trait domain changed 
the most in response to intervention (see Table 2). Clearly, we 
would expect the effect sizes to be larger for trait domains that are 
either implicitly or explicitly linked to the focus of the interven- 
tion. In the case of the clinical studies, most interventions con- 
cerned depression, anxiety, or eating disorders, which are comorbid
with depression and anxiety. Therefore, we expected larger
changes on the trait domains most strongly linked to affect: neu- 
roticism (i.e., emotional stability reversed), which is associated 
with negative affect, and extraversion, which is closely aligned 
with positive affect. When broken out by the Big Five categories, 
the results for the clinical studies were consistent with these 
expectations. The largest effects were found for the domain of 
emotional stability, adjusted d = .57, 95% CI [.52, .62], followed 
by extraversion d = .23, 95% CI [.17, .29]. Blended traits, which 
represented a mixture of positive and negative emotionality, also 
tended to change more than other dimensions, d = .27, 95% CI
[.23, .31]. Of the remaining Big Five domains, agreeableness, d =
.15, 95% CI [.11, .20], and conscientiousness, d = .19, 95% CI [.14,
.23] showed increases that were different from zero. The changes 
found for openness to experience (d = .13, 95% CI [.07, .18]) were 
not robust across corrections for small study bias. In particular the 
PEESE correction indicated no evidence of change for openness. 
However, the remaining small study estimates did show an indi- 
cation that the changes in openness to experience were robust. 
Despite the clear pattern of expected effects for emotional stability 
and extraversion, only emotional stability had a significant mod- 
erator effect (Table S3, p < .001). Trait dimension explained a
substantial portion of between-study variance (19%, Table S3), 
which was much larger than any other moderator (M = 2%). 

As these results for the Big Five reflect prepost change scores 
aggregated from mostly observational studies of the effectiveness 
of therapy, they do not provide evidence for differential causal 
effects of therapy for different domains of the Big Five. In order to 
identify potential causal effects on different Big Five domains, we 
aggregated the effects drawn from the experimental studies within 
each Big Five domain (see Table 3). Given the similarities of the 
effects across clinical and nonclinical studies, and because there was a
much smaller number of experimental studies, we aggregated all of
the experimental studies together. Consistent with the results for 
prepost change scores, the largest effects were found for emotional 


stability, d = .69, 95% CI [.45, .92]. Also consistent with expec- 
tations, the second largest effect was found for extraversion, d = 
.38, 95% CI [.18, .58], with notable effect sizes found for the 
remaining Big Five domains. Blended traits also displayed rela- 
tively large increases, d = .27, 95% CI [.10, .44]. However, the 
only effects that were robust to the small study bias analyses were 
those found for emotional stability, such that any changes found in 
the remaining Big Five should be interpreted with caution. Similar 
to the prepost results, emotional stability had a significant moder- 
ator effect (p = .001), such that it was significantly different from 
the other estimates of change (Table S4). Again, trait dimension 
explained a large amount of between-study variance (43%, Table 
S4), which was much larger than other moderators (M = 4%). 


Does the Type of Therapy Matter? 


A number of other potential moderators and their combinations 
emerged in the compilation of the studies. Relevant to clinical 
psychology, many studies reported the effects of various psycho- 
social interventions, such as cognitive-behavioral therapy, or
pharmacological interventions, largely focused on antidepressants. 
Consistent with the prevailing research on clinical outcomes, all 
forms of therapy, with the exception of hospitalization, showed 
similar levels of efficacy in changing personality traits (see Table 
2). The effects for supportive therapy, cognitive behavior therapy, 
and psychodynamic approaches were all .38 or larger and were largely
indistinguishable from one another. Supportive and cognitive-
behavioral therapy were associated with a slightly larger change
than the average treatment (Table S3), but the difference in mag- 
nitude was small (difference in d of approximately .1, with p’s = 
.03 and .02, respectively). Psychopharmacological therapies did 
exhibit a slightly smaller effect than the remaining approaches, 
d = .31, 95% CI [.26, .35], p = .03. In contrast, being hospitalized
resulted in the smallest amount of change, d = .16, 95% CI [.07, 
.26] and was statistically significantly different from the average 
(p < .001). All of the estimates for the different types of therapy 
were robust to small study bias. 


Does Type of Presenting Problem in Clinical 
Studies Matter? 


Table 2 also shows the overall magnitude of personality trait 
change broken down by the presenting problem being treated in 
the clinical intervention studies. We were able to compile enough 
studies to compare six types of presenting problems: depression, 
anxiety, personality disorder, eating disorders, substance use dis- 
orders, and mixed diagnoses. As can be seen by the effect size 
estimates in Table 2, the type of presenting problem being treated 
does appear to moderate the amount of personality trait change that 
occurred. Specifically, the changes seen in patients presenting with 
anxiety, d = .54, 95% CI [.42, .66], and personality disorders, d = 
.53, 95% CI [.39, .67], were larger than those for patients present-
ing with other disorders (both moderator effects p < .05, Table
S3). Conversely, changes associated with eating disorders and
substance use disorders were smaller than the remaining categories 
(both moderator effects p < .01, Table S3). In fact, the PEESE 
estimator for eating disorders was not distinguishable from zero. 
Effect sizes associated with depression and mixed diagnoses were 
at roughly the midpoint of the other conditions (both moderator 
effects p > .4, Table S3). 
Focusing on Changes in Emotional Stability


One potential biasing factor affecting the variability, or lack 
thereof, across the different moderators is the uneven distribution 
of types of measures used in individual studies. Some types of 
studies, for example, those focusing on supportive therapy, fo- 
cused more strongly on personality traits that show more change, 
such as emotional stability. This may have affected the average 
magnitude of the effects found for each type of intervention shown 
in Table 2. To put the different forms of therapy and presenting 
problems on more equal footing for comparison, we limited the 
analyses to only the domain of emotional stability (see Table 4). 
Consistent with this interpretation, the effect sizes for all forms of 
therapy and all presenting problems were larger when examining 
changes in emotional stability. The most conspicuous change 
across these analyses is that all effects were now robust to small 
study bias. Substantively, when examining only emotional stability 
outcomes, the three most effective therapies were cognitive-
behavioral, supportive, and mixed therapeutic approaches. More-
over, even hospitalization showed a robust association with im- 
provement on measures of emotional stability, d = .35, 95% CI 
[.19, .52]. Mirroring the earlier results, cognitive-behavioral ther-
apy elicited slightly more change and hospitalization slightly less
(both p < .01, Table S5) compared with the other interventions.


The pattern of moderator effects for presenting problems became
much more homogeneous when only examining emotional stability
outcomes. While anxiety disorders did again show the largest
effects, the effects of remaining categories were also quite large
and robust to small study bias. Again, mirroring results from the
full dataset, effect sizes associated with anxiety disorders
displayed relatively large increases and substance use disorders
displayed relatively smaller increases (both p < .01, Table S5)
compared with the other disorders.


Table 4
Prepost Personality Change Estimates for Emotional Stability Effect Sizes

Moderator categories      Full Model ES      PEESE Model ES     Visual Model ES    High Power ES
                          [95% CI]           [95% CI]           [95% CI]           [95% CI]

Panel 1: Average effect size
  Average effect size     .59 [.53, .64]     .38 [.33, .44]     .49 [.44, .54]     .48 [.41, .55]
Panel 2: Clinical vs. Nonclinical vs. Control
  Clinical treatment      .61 [.55, .67]     .41 [.35, .47]     .51 [.46, .56]
  Clinical control        .26 [.08, .44]     -.02 [-.20, .16]   .23 [.02, .44]
  Nonclinical treatment   .35 [.16, .55]     .10 [-.11, .31]    .31 [.12, .49]
  Nonclinical control     .09 [0, .17]       -.08 [-.20, .05]   .08 [0, .16]
  Comparison group        .22 [.14, .29]     -0 [-.07, .07]     .22 [.13, .30]
Panel 3: Follow-up interval
  Immediate               .50 [.46, .55]     .33 [.27, .38]     .44 [.39, .48]     .45 [.39, .50]
  6 month                 .81 [.64, .97]     .57 [.40, .75]     .68 [.48, .88]     .79 [.41, 1.17]
  12 month                .82 [.70, .94]     .58 [.48, .67]     .69 [.58, .80]     .56 [.33, .79]
  1 year +                .76 [.60, .92]     .57 [.44, .70]     .62 [.47, .77]     .52 [.35, .68]
Panel 4: Intervention type
  Pharmacological         .48 [.38, .58]     .31 [.23, .40]     .39 [.32, .46]
  Cognitive behavioral    .73 [.61, .86]     .45 [.34, .55]     .53 [.44, .62]
  Supportive              .68 [.47, .88]     .45 [.24, .66]     .61 [.39, .83]
  Psychodynamic           .49 [.39, .59]     .32 [.22, .41]     .45 [.36, .54]
  Hospital                .35 [.19, .52]     .28 [.11, .45]     .37 [.20, .53]
  Mixed                   .66 [.56, .75]     .47 [.38, .56]     .57 [.47, .67]
Panel 5: Presenting problem
  Depression              .59 [.51, .66]     .50 [.43, .57]     .58 [.50, .65]
  Anxiety                 .75 [.61, .89]     .36 [.22, .51]     .51 [.37, .64]
  Personality disorder    .59 [.45, .72]     .35 [.23, .46]     .46 [.37, .55]
  Eating disorder         .61 [.40, .83]     .39 [.14, .64]     .57 [.31, .83]
  Substance use           .35 [.24, .46]     .28 [.18, .38]     .35 [.25, .45]
  Mixed                   .55 [.44, .65]     .39 [.28, .50]     .50 [.39, .62]

Note. Expected effect sizes reported are derived from meta-regression
models. Full Model reports results based on a weighted random effects
model. PEESE Model reports results additionally controlling for the
squared standard error to further correct for small study effects.
Visual Model reports results based on a weighted random effects model
where studies from the asymmetrical portion of the funnel plot are
omitted (SEs > .2). High Power Model reports results based on a
weighted random effects model only including high power studies
(SEs < .1). ES = effect size; CI = confidence interval. Due to limited
coverage, we were unable to estimate the High Power Model for clinical
vs. control, intervention type, or presenting problem. The estimates of
the prepost scores for the clinical treatment and control groups do not
reflect the overall experimental effect because these estimates only
include those studies that reported prepost scores, which are a subset
of the studies used to estimate the experimental effect.

Figure 2. Exponential trends in treatment duration.


How Long Does an Intervention Have to be to
Change Personality?


Extremely short interventions, lasting days rather than weeks,
may be unlikely to change personality. To test this, we examined
the relation between effect size and duration of intervention.
Conceptually, a linear model is unlikely to hold as the
effectiveness of therapy may reach an upper asymptote across time.
Empirically, our data contained wide variability in intervention
duration, and initial inspection of the scatterplot indicated that
there may be an exponential relation between duration and effect
size. Results are presented in Table S6 and graphically in Figure 2.
The central time-related parameter was statistically significant in
the full dataset as well as in the emotional stability subset and
across all corrections for small study effects. All analyses
converge on the 
finding that interventions lasting less than approximately 4 weeks 
tend to have small effects. Beyond roughly 8 weeks, longer inter- 
ventions do not induce greater personality change. This result 
provides critical information about how quickly personality can 
change. 
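
To make the shape of this relation concrete, the sketch below evaluates one functional form with the properties described above: a saturating exponential that rises quickly and then levels off at an upper asymptote. The function name `expected_effect`, the functional form itself, and the values of `ASYMPTOTE` and `RATE` are illustrative assumptions. They are not the model or the estimates reported in Table S6.

```python
import math


def expected_effect(weeks, asymptote, rate):
    """Saturating exponential: the effect approaches `asymptote` as duration grows."""
    return asymptote * (1.0 - math.exp(-rate * weeks))


# Hypothetical parameters chosen only for illustration (not fitted values).
ASYMPTOTE = 0.40  # upper limit of the effect size, in d units
RATE = 0.50       # per-week rate of approach to the asymptote

for weeks in (1, 2, 4, 8, 24, 52):
    d = expected_effect(weeks, ASYMPTOTE, RATE)
    print(f"{weeks:>2} weeks -> expected d = {d:.3f}")
```

Under a curve of this kind, nearly all of the gain accrues in the first weeks, and durations well past the point of saturation add almost nothing, which is the qualitative pattern described in the text.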


Ancillary Moderators 


As noted, we also coded studies for several factors that, al- 
though not directly relevant to the issue of whether interventions 
change personality traits, may have acted as moderators of the 
findings. These moderators were year of publication, mean age of 
the sample, and percentage of the sample that was female. Publi- 
cation year was weakly associated with effect size, with an esti- 
mated increase of .04, 95% CI [.02, .06] per decade. Age of the 
sample was not associated with effect size (p = .63). Percent of the 
sample that was female was not associated with effect size (p = 
.56). We also examined whether samples composed entirely of
females differed in effect size from samples composed entirely of
males, but the effect sizes did not differ, b = .03, 95% CI [-.02,
.09]. Next, we conducted exploratory analyses to investigate
whether gender composition moderated the effectiveness of
treatment or the responsiveness of certain disorders. Gender (i.e.,
percentage of the sample that was female) did not moderate the
effectiveness of any treatment (ps > .23). For presenting problem,
studies focusing on eating disorders were almost entirely female
(minimum percent female > 90%). Therefore, we omitted eating
disorder effect sizes from this analysis. After doing so, results
indicated that personality traits changed less in studies of
personality disorders when there was greater female representation
in the sample 
(p = .04). However, upon inspection of the data, two studies were 
outliers in terms of very low female representation, and when 
omitted, the effect was no longer significant (p = .50), limiting 
confidence in the robustness of this result. Thus, gender
composition of the sample does not appear to have direct or
interactive effects on personality trait change. Because these
results are largely null, we 
did not investigate small study bias. 


Discussion 


The current meta-analysis sought to investigate the extent to 
which personality traits changed as a result of intervention, with 
the primary focus on clinical interventions. Focusing exclusively
on studies that included measures of personality traits, we found
and synthesized 207 studies. One of the most salient features of
the results was the pervasive, positive, and reasonably large
pattern of change found on personality trait measures across
various ways of examining the data. Interventions are clearly
associated with changes in personality trait measures in the
reported literature. 

The most pressing question is whether the interventions actually 
changed personality traits or resulted in some other type of change, 
such as changes in episodic states rather than traits. Two clear 
positions have been outlined in prior theoretical and conceptual 
work on clinical interventions and personality trait change—the 
state-artifact and the cause-correction positions. Both positions 
argue that personality traits will change during a typical clinical 
intervention. Where the positions diverge is on the meaning of that 
change. According to the state-artifact position, the changes in 
personality traits during therapy can be attributed to state-content 
contamination of typical trait measures. People who change their 
moods will also change their ratings on trait measures, but these 
types of changes do not reflect real trait change. In contrast, the 
cause-correction position posits that the changes in personality 
trait measures are enduring and have real consequences for clinical 
outcomes such as recovery and relapse (Tang et al., 2009). 

The findings of the primary studies contained in the meta-analysis
offered various forms of evidence that, in aggregate, shed some
light on whether the changes were the result of state-level
contamination. First and foremost, it was important to know
whether interventions experienced by an experimental group (a
true experimental condition) worked relative to a true control
group. Fortunately, a group of such studies, with both clinical and
nonclinical foci, tracked personality traits. Across both types of 
interventions, experimental groups changed more than control 
groups and at magnitudes that were quite large relative to obser- 
vational studies of personality change over similar time periods. 
For that matter, the change in emotional stability, the most highly 
relevant trait domain to clinical interventions, was approximately 
half of the change expected over the entire life span based on 
typical passive observational studies (Roberts et al., 2006). If 
aggregated, it appears that most people increase on emotional 
stability approximately 1 standard deviation from young adulthood 
through middle age. Therapy lasting 4 or more weeks achieves half 
that amount of change. Moreover, patterns of change in the control 
groups helped to rule out the argument that the changes found in 
personality traits were simply the result of rebounding from a low 
point in one’s life as the changes in the experimental groups were 
much larger on average. It appears that interventions cause 
changes in personality trait ratings over the short run. 

Although the findings from the true experimental studies bolster 
the inference that interventions can cause short-term changes in 
personality traits, this does not address the primary argument 
behind the state-artifact position that the changes caused by ther- 
apy are short-lived because they reflect changes in states, not traits. 
To further address this idea, we compiled studies that followed up 
on samples after the participants had terminated their therapy. 
Regardless of the timeline, the effects consistently supported the 
cause-correction position. There was no evidence that the effects 
of therapy faded with time. Specifically, personality levels re- 
mained altered after more than a full year by between .24 and .37 
standard deviations, and more impressively, emotional stability 
levels remained altered after a similar time span by between .52 
and .76 standard deviations, depending on the correction for small 
study effects. So, within the confines of the studies contained in 
the current meta-analysis, it appears that the preponderance of data 
supports the cause-correction hypothesis and the inference that 
seeing a therapist can lead to lasting personality trait change. 

There is, however, one major caveat to this conclusion. It is 
possible that the changes that clinicians impart are not remaking 
someone’s personality. Rather, what a clinician might be doing is 
bringing people back to the baseline that existed before their 
episode of psychopathology. It is a fact that clinical intervention 
studies all start with people who are suffering, so we do not know 
what the participants in controlled intervention studies were like 
well before they experienced depression, anxiety, or some other 
form of psychopathology. We do know from passive longitudinal 
studies that people who go into therapy typically score higher on 
neuroticism prior to engaging in a therapeutic relationship (e.g., 
Lüdtke et al., 2011). Unfortunately, this type of information is 
lacking for the entire corpus of intervention research since re- 
searchers do not have the ability to track entire populations out of 
which they select individuals for treatment. The key, unanswerable 
question is exactly where intervention study participants were on 
personality trait levels well before their clinical episodes. But this 
type of data is almost impossible to acquire. 

However, the set of nonclinical studies provides further evi- 
dence that interventions do lead to meaningful personality trait 
change. These studies did not focus on samples that had a prior 
history of psychopathology, and therefore, any change exhibited 
by these types of interventions provides stronger evidence for real 
change that goes beyond bringing a person back to their prior 
baseline. Put differently, it is hard to imagine how the state-artifact 
hypothesis could apply to these effect sizes. As we found, non- 
clinical groups changed just as much as clinical samples on per- 
sonality trait measures. Unfortunately, the set of studies in this 
category is much smaller, and none of these studies report long- 
term change patterns. It is still a possibility that these changes fade 
with time, though the evidence is suggestive that they do not. 

It is also the case that the ideal data to differentiate state-level 
change from trait-level change was not provided in a form that 
could be tested systematically in this data set. In particular, an 
ideal test of the differentiation of the state-artifact and cause- 
correction positions would be to assess change in both states and 
traits and control for the change in states when examining change 
in traits. Unfortunately, too few studies provided the necessary 
data³ to test this more optimal approach to differentiating these 
two positions. Nonetheless, future research could help clarify this 
omission by examining state and trait change simultaneously. 

In sum, the data found in this set of studies provide tentative 
support for the idea that interventions do lead to personality trait 
change over time. Nonetheless, the data are not complete and 
without evidence such as the long-term efficacy of change inter- 
ventions in nonclinical samples, we believe it would be prudent to 
be cautious in making a strong case that clinical interventions 
change personality traits. 


Personality Traits Can Change Quickly 


Quite possibly the most significant finding of this research is the
fact that personality traits can and do change more quickly than
commonly thought. The examination of the relation between
duration of treatment and change showed that most of the gains were 
made within the first month of therapy. This empirical fact con- 
tradicts the widely held assumption that personality traits typically 
change slowly and gradually over many years (Roberts, 2006). 
Conceptually, the idea of gradual change was consistent with the 
bottom-up process of change informed by theories such as the 
sociogenomic model of personality traits (Roberts & Jackson, 
2008). This model suggests that the slow accretion of small be- 
havioral, attitudinal, and emotional changes that occur over time 
eventually become internalized and automatized. The sociog- 
enomic assumption of slow and gradual change was derived from 
the large number of passive, observational, longitudinal studies on 
personality trait change, which show modest changes occurring 
over relatively long periods of time (Roberts et al., 2006). The 
findings from the current meta-analysis invite closer scrutiny of 
this conclusion. 

Specifically, the appearance of progressive, incremental, and 
small amounts of change made over long periods of time may have 
been an artifact of the study designs typically employed in longi- 
tudinal personality research. Almost no longitudinal studies have 
tracked personality traits continuously during the length of the 
longitudinal study. If we imagine combining the typical clinical 
intervention design (continuous assessments every few weeks or 
months) with the typical personality longitudinal design (assess- 
ments spaced out across long time intervals), we may see a second 
explanation for the seemingly modest changes in personality traits 
over long periods of time. The modest changes that appear in 
passive longitudinal studies may have resulted from a minority of 
individuals changing dramatically over short periods of time, with 
most others remaining roughly the same on trait levels. The typical 
passive longitudinal design would be insensitive to this possibility, 
and the resulting modest pattern of change could be a misrepre- 
sentation of the process of change that occurs. 

This possibility has been alluded to in several close examina- 
tions of change in passive longitudinal studies. Specifically, a set 
of longitudinal studies utilized a very conservative index of 
change, called the Reliable Change Index (RCI; Roberts et al., 
2001; Robins et al., 2001), which was created to determine 
whether the changes that resulted from therapeutic interventions 
were larger than would be expected by chance (Jacobson & Truax, 
1991). Interestingly, when the RCI has been used, dramatic
levels of change on personality traits have been found for most
populations (Blonigen, Hicks, Krueger, Patrick, & Iacono, 2006;
Pullman, Raudsepp, & Allik, 2006). More importantly, and relevant
to the findings of the present study, only a minority of
individuals show dramatic changes on any given trait. In fact, the 
base rate of these dramatic changes has been estimated to happen 
on one in five traits over periods as long as 8 years (Roberts et al., 
2001). Thus, the modest changes found at a population level may 
reflect very large changes that occur in a small subset of any given 
population on any given trait, a finding consistent with the
evidence from this study that personality traits can and do change
in a short period of time.

3 Ideally, one would need state and trait measures such as state anxiety
and trait anxiety and the correlation between each within the samples being
followed. It is the latter information that is almost never reported in clinical
intervention studies. We hope with the advent of online data repositories
that this type of information can be recorded going forward. 
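
For readers unfamiliar with the RCI, a minimal sketch of the computation follows. It assumes the standard Jacobson and Truax (1991) formulation, in which the difference between two scores is divided by the standard error of the difference, itself derived from the baseline standard deviation and the reliability of the measure. The function names, the 1.96 cutoff as a default argument, and the example scores are hypothetical and do not come from any study in the data set.

```python
import math


def reliable_change_index(score_pre, score_post, sd_pre, reliability):
    """RCI = (post - pre) / S_diff, with S_diff = sqrt(2 * SEM^2)."""
    sem = sd_pre * math.sqrt(1.0 - reliability)  # standard error of measurement
    s_diff = math.sqrt(2.0 * sem ** 2)           # standard error of the difference
    return (score_post - score_pre) / s_diff


def classify_change(rci, cutoff=1.96):
    """Label a change as reliable when |RCI| exceeds the cutoff."""
    if rci > cutoff:
        return "reliable increase"
    if rci < -cutoff:
        return "reliable decrease"
    return "no reliable change"


# Hypothetical person on a 1-5 scale; SD and reliability are assumed values.
rci = reliable_change_index(score_pre=3.0, score_post=4.0, sd_pre=0.5, reliability=0.8)
print(round(rci, 2), classify_change(rci))
```

Because the index is computed per person and per trait, it can flag a small subset of individuals with large changes even when the mean-level change in the sample is modest.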

Unfortunately, given the assumptions that underlie personality 
traits, this possibility has never been tested because very few 
researchers have tracked personality traits both continuously and 
over a long period of time. What this highlights is the fact that 
within personality science, we do not know or understand much 
about the process of personality trait change because we have yet 
to reliably track the change as it happens. 

The solution to this oversight is relatively simple. Researchers 
need to conduct longitudinal studies of personality traits in which 
the traits or their concomitant behavioral, attitudinal, and affective 
states are assessed on a more regular basis, such as every week or 
month in order to capture when and how change occurs. It is 
possible that the process of personality trait change plays out in 
multiple ways, with some people changing slowly and incremen- 
tally, and others changing quickly and then solidifying the gains or 
losses they experienced and changing very little thereafter. These
two process models of change (the incremental model and what
we might describe as the “punctuated equilibrium” model of
personality trait change) remain untested explanations of the
process of personality change. 


Potential Moderators of Personality Trait Change 


The clearest moderator of the effect of clinical interventions on 
personality trait change was the domain of the trait itself. Consis- 
tent with expectations, the two trait domains that changed the most 
were emotional stability (neuroticism) and extraversion for both 
clinical and nonclinical studies. These effects were seen in both the 
prepost change scores and the effects derived from true experi- 
ments. The magnitude of change in emotional stability was about 
one half of a standard deviation, which is notable, as it is roughly 
equivalent to half the gain typically made in these traits across an 
entire life course (Roberts et al., 2006). That clinical interventions, 
which are largely targeted at alleviating negative affect (e.g., 
anxiety) and addressing an absence of positive affect (e.g., depres- 
sion), would impart change largely in these two trait domains is not 
surprising. 

Of the remaining trait domains, we found modest changes that 
were not always robust to small sample bias. This mixed set of 
results would caution against strong arguments for the potential 
effect of therapy on trait domains other than extraversion and 
emotional stability. We view the lack of systematic change across 
all of the Big Five as a good sign. If we were to find substantial 
changes on all positive traits, it would invite the possibility that 
personality trait change that resulted from therapy reflected in- 
creases in self-presentation of being “better,” or the possibility of 
a global positivity effect of therapy (see Mu, Luo, Nickel, & 
Roberts, 2016). The relative specificity invites the inference that 
interventions might be tailored to address narrow traits within trait 
domains with high fidelity. 

We also considered two major potential moderators of the types 
of changes that would occur on personality traits: the form of 
therapy being used and the presenting problem being treated. 
Consistent with the Dodo Bird Effect (Wampold et al., 1997), we 
found little systematic difference between the type of therapy used 
to treat patients, especially when examining emotional stability as 
the outcome. Clinical research has consistently found that type of 
therapy is not strongly associated with differences in the efficacy 
of therapy (Luborsky et al., 2002). More to the point, therapy 
appears to lead to improvement regardless of what type of therapy 
is administered. Our findings for change in personality traits were 
surprisingly similar to prior research demonstrating little or no 
difference in the effectiveness of different forms of clinical inter- 
ventions, with the exception of people who were hospitalized for 
their problems. This latter group may be less relevant to evaluating 
therapies, as it may be a better indicator of what happens when 
people are stabilized after experiencing the most severe episodes 
of psychopathology. 

There were two apparent effects for moderation by presenting 
problems. Of the six categories of presenting problems, people 
presenting with anxiety-related issues and personality disorders 
showed the most change, and people presenting with eating dis- 
orders and substance use problems showed the least amount of 
change. These apparent differences were reduced markedly when 
we just examined emotional stability outcomes. This lack of
difference between presenting problems was interesting because
one of the presenting categories was personality disorders, which
are often considered untreatable or difficult to treat at best.
Therapists appear to be successfully changing individuals with
personality disorders, even if therapists do not believe they are
doing so. One possibility is that people with personality disorders
are experiencing a much wider and more severe set of problems
than people with other types of disorders. 
This might mean that the changes, though real, are still somewhat 
dwarfed by the multitude of problems (e.g., interpersonal difficul- 
ties, problems in work and family) that someone faces when 
diagnosed with, for example, borderline personality disorder 
(Gunderson et al., 2011; Skodol et al., 2005). Answers to questions
like these, raised by this review, await more focused longitudinal
and intervention research. 

Of the remaining moderators, we found that duration of treat- 
ment had a nonlinear relation with personality trait change, such 
that interventions that lasted less than 1 month were less
efficacious than those that lasted longer. We would caution against
making strong inferences based on these analyses, as there were
relatively few studies shorter than 1 month in duration. On the
other hand, 
the asymptotic nature of long-term treatment is worthy of deeper 
study, as it implies that the effect of very long-term therapy may 
not be easily differentiated from more modest interventions. Year 
of publication, gender, and age did not have systematic relations to 
personality trait change. These null findings are interesting be- 
cause of their implications. First, there appears to be no decline 
effect in this literature given the lack of association between year 
of publication and the reported effect sizes of personality trait 
change. Decline effects often occur when the first reports of an 
effect are larger than the subsequent reports, often because the 
nature of the publication incentives rewards provocative, if less 
well-designed, studies when first reporting results. The lack of a 
systematic effect of gender indicates that therapies are not cur- 
rently differentially affecting men and women with respect to 
personality trait change. If one is hoping that current therapeutic 
approaches are equally effective across genders, then this is a 
positive finding. In contrast, if one hopes that certain therapies 
might be differentially effective with men or women due to
targeting of unique etiological factors, these findings give no
indication that the types of therapies studied here help in that
regard. The lack of an age effect is also interesting as it supports
the plasticity 
of personality across the life course given the samples ranged from 
adolescence through old age. The fact that therapies appeared to be 
just as efficacious with young, middle-aged, and old populations 
would appear to support the plasticity principle that personality is 
an open system and amenable to change even if change decreases 
with age (Roberts, Wood, & Caspi, 2008). 


Limitations and Future Directions 


One salient limitation of the compiled data set is the reliance on 
self-report measures of personality traits. We did find a handful of 
studies that used more than one method, and these few studies 
reported that observer ratings of patients (often performed by the 
therapists themselves) also increased in a positive direction (e.g., 
Hoglend et al., 2008). That said, the inference that therapy results 
in a truly noticeable improvement on personality traits would 
benefit greatly from research designs where friends and interview- 
ers who were not directly involved with the intervention were the 
source of observer ratings. Including observer ratings alongside 
self-reports would allow stronger inferences that personality traits 
are actually changing and not less compelling constructs, such as 
response sets. Moreover, developing some form of unobtrusive or 
objective index of personality traits that was not subject to self- 
presentational strategies or other biases would also benefit re- 
searchers’ ability to infer that interventions resulted in true change. 

Another limitation of the body of research concerns the con- 
struct validity of the changes that are found. That is, we do not 
know whether the changes incurred by therapy predict important 
outcomes. For example, in passive observational studies of per- 
sonality development, it is now known that changes in personality 
traits are predictive of important outcomes above and beyond 
original standing on those same traits (Moffitt et al., 2011; Mroc- 
zek & Spiro, 2007; Takahashi et al., 2013). Likewise, it would 
alleviate concerns about the changes in self-reported personality 
traits if the changes themselves predicted consequential outcomes 
beyond the therapeutic intervention, such as later relapse (e.g., 
Tang et al., 2009; Vittengl, Clark, & Jarrett, 2010; Vittengl, Clark, 
Thase, & Jarrett, 2015). 

The heterogeneity of our data was also an issue. Our meta- 
analytic results found substantial mean-level change in person- 
ality, but this change was not uniform over all of the observed 
studies. For example, we demonstrated that change was more 
pronounced for measures of emotional stability. However, sub- 
stantial between-study variance remained even after including 
coded moderators. On the one hand, this result might be ex- 
pected as we drew on an extremely diverse set of studies. We 
included studies of any personality dimension in any setting for 
any disorder (or no disorder), as long as an intervention was 
applied. On the other hand, between-study variance has two 
important statistical implications. First, between-study variance 
reduces statistical power (Pigott, 2012). Because we drew on a 
very large body of research, statistical power was not a large 
concern for our analyses, and indeed, we report narrow confi- 
dence intervals reflecting this fact. Second, large between-study 
variance implies that our aggregate effect sizes do not apply 
universally, but rather are the average expected effect size. Our 
results imply that there are many situations in which one would 
expect smaller effect sizes than the average, and many other 
situations in which one would expect larger effect sizes than the 
average. Apart from personality dimension, our coded moder- 
ators accounted for relatively little of the between-study vari- 
ance. Future work in this area will be necessary to tease apart 
the factors that lead to such variance. 
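
The two statistical implications described above can be illustrated with a small random-effects computation. The sketch below uses the DerSimonian-Laird moment estimator of between-study variance, a common textbook choice that is swapped in here for simplicity and is not necessarily the estimator used in the reported meta-regressions. The function name `dersimonian_laird` and the three toy effect sizes are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Return (pooled effect, its standard error, tau^2) under random effects."""
    k = len(effects)
    if k < 2:
        raise ValueError("at least two studies are required")
    w = [1.0 / v for v in variances]             # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)           # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Toy data: three equally precise studies with very different effects.
pooled, se, tau2 = dersimonian_laird([0.1, 0.5, 0.9], [0.01, 0.01, 0.01])
print(f"pooled d = {pooled:.2f}, SE = {se:.3f}, tau^2 = {tau2:.3f}")
```

As between-study variance grows, each study's weight shrinks and the standard error of the pooled estimate widens, which is the loss of power noted above. The pooled value is then best read as the mean of a distribution of true effects rather than as a single effect that applies everywhere.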

Another limitation of the compiled data was the pervasive 
evidence for publication bias throughout many of the categories 
that were analyzed. According to the PEESE test and the funnel 
plot analyses, many of the estimates showed some signs of 
publication bias. The bias, though pervasive, was not enough to 
eliminate the effects of therapy on personality trait change 
when a variety of adjustments were made. Nonetheless, it 
indicates a need for more preregistered, controlled studies con- 
ducted by individuals who are not motivated to show the 
effectiveness of any given therapy or intervention. Also, most 
of the studies in the data set were powered only to detect 
medium or large effects. Larger sample sizes would potentially 
enhance the accuracy of future estimates of change. Moreover, 
most research examined interventions drawn from programs 
designed at a specific hospital or clinic. One opportunity af- 
forded by focusing on personality traits, rather than psychopa- 
thology, is that interventions could be implemented outside of 
institutions that employ interventions as a rule and thus provide 
a more objective evaluation of the intervention efficacy. 
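
The logic of the PEESE adjustment can be shown with a minimal sketch. In its simplest form, effect sizes are regressed on their squared standard errors using inverse-variance weights, and the intercept estimates the effect expected from a hypothetical study with a standard error of zero. The models in this article were more elaborate (weighted random effects meta-regressions with moderators), so the function `peese_intercept` and the constructed toy data below are illustrative assumptions only.

```python
def peese_intercept(effects, std_errors):
    """Weighted least squares of effect size on SE^2; returns (intercept, slope)."""
    x = [se ** 2 for se in std_errors]   # predictor: squared standard error
    w = [1.0 / xi for xi in x]           # inverse-variance weights
    sum_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Constructed toy data: a true effect of .30 plus inflation that grows with SE^2.
ses = [0.05, 0.10, 0.20, 0.30]
ds = [0.30 + 2.0 * se ** 2 for se in ses]
intercept, slope = peese_intercept(ds, ses)
print(f"PEESE-adjusted effect = {intercept:.2f}, small-study slope = {slope:.2f}")
```

A positive slope indicates that less precise studies report larger effects, the pattern expected under publication bias. The intercept is the estimate after removing that trend, which is why the PEESE estimates in Table 4 are smaller than the Full Model estimates.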

The results of the study also challenge future researchers to 
create an empirical edifice that is more detailed and thus more 
easily tested and refuted. Specifically, we know very little about 
the magnitude of variability in personality and behavior across 
different time periods. The current study is novel in large part 
because personality developmental researchers have failed to con- 
sider that change could occur in very short periods of time when 
designing their longitudinal studies. Thus, we simply do not have 
data on how personality trait measures behave over weeks, 
months, and even a year because the assumptions have always 
been that years were needed to detect development. Without this 
information, it is difficult to gain perspective on the amount of 
change we found in this study and the absolute minimum amount 
of time sufficient for personality trait change to occur. While a half 
of one standard deviation appears large from the perspective of the 
average effect sizes in psychology, we still do not know whether 
that is meaningful. Mapping the magnitude of change across time 
and measures is a basic task that is in dire need of attention. 

Another key necessity for future research is to identify the 
mechanisms responsible for personality change in the therapeutic 
setting. It is tempting to endorse specific theoretical and concep- 
tual explanations typically invoked in clinical research. For exam- 
ple, one could propose that the cognitive reorganization intrinsic to 
a cognitive-behavioral approach to therapy is a viable mechanism 
that helps impart personality trait change. However, the fact that 
personality trait change happened across all types of therapeutic 
intervention militates against this argument. People clearly change 
across many different constructs as a result of going to see thera- 
pists, but the explanations for why remain elusive, especially in the 
case of personality trait change. 


Conclusion 


Modern personality trait theories have successfully moved beyond 
the false dichotomies posed by prior generations of researchers 
that painted a picture in which personality traits were either 
perfectly stable or permanently variable (Roberts, 2009). Personality 
traits not only show robust evidence for change across the life 
course, but also show meaningful changes in relation to life experiences 
(Roberts & Mroczek, 2008). Moreover, theoretical systems have 
developed that move modern trait theory to a stage in which not 
only are these changes possible, but it is also possible to entertain 
the more provocative question of whether personality traits can be 
changed through specific interventions. 

As this review has shown, the contemporary take on personality 
traits simply catches up with the wealth of evidence that has been 
accumulating in clinical science for decades. Clinicians and other 
interventionists have been changing personality traits for many years. 
Appropriately, clinical researchers have identified personality traits, 
primarily neuroticism, as key contributors to psychopathology (La- 
hey, 2009) and suggested that they should be the primary focus of 
interventions given their widespread relation to various forms of 
psychopathology (Barlow et al., 2014). It is time that these findings 
are fully appreciated in both personality psychology and other 
fields—such as economics, political science, and health psychol- 
ogy—that use personality traits as tools. Personality traits are not 
only robust predictors of important life outcomes, but also appear 
to be amenable to intervention. This fact opens the door to a new 
era of research that more strongly links personality psychology, 
clinical psychology, and other groups, such as educational 
psychologists and economists, who are interested in changing people 
and their behaviors in order to help them with their lives. 